GPLSIUA: Combining Temporal Information and Topic Modeling for Cross-Document Event Ordering

Authors

  • Borja Navarro-Colorado
  • Estela Saquete Boró
Abstract

Building unified timelines from a collection of written news articles requires cross-document event coreference resolution and temporal relation extraction. In this paper we present an approach to event coreference resolution based on two criteria: a) similar temporal information, and b) similar semantic arguments. Temporal information is detected using an automatic temporal information system (TIPSem), while semantic information is represented by means of LDA Topic Modeling. The evaluation of our approach shows that it obtains the highest Micro-average F-score results in the SemEval2015 Task 4: “TimeLine: Cross-Document Event Ordering” (25.36% for TrackB, 23.15% for SubtrackB), with an improvement of up to 6% in comparison to the other systems. However, our experiment also showed some drawbacks in the Topic Modeling approach that degrade the performance of the system.
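
The two criteria named in the abstract (same temporal anchor, similar semantic arguments) can be illustrated with a minimal sketch. It is not the authors' implementation: it does not call TIPSem or train an LDA model. The dates and topic vectors are hand-written stand-ins for their outputs, and the cosine measure, the greedy single-pass grouping, the `threshold` value, and the names `cosine` and `group_events` are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def group_events(events, threshold=0.8):
    """Group event mentions that share a date and have similar topic vectors.

    events: list of (mention, iso_date, topic_vector) tuples. The dates stand
    in for temporal-tagger output and the vectors for LDA topic distributions
    (both hypothetical inputs here).
    Returns a date-ordered list of (iso_date, [mentions]) clusters.
    """
    # Step a) temporal criterion: bucket mentions by their anchored date.
    by_date = defaultdict(list)
    for mention, date, topics in events:
        by_date[date].append((mention, topics))

    timeline = []
    for date in sorted(by_date):  # ISO dates sort chronologically as strings
        clusters = []
        # Step b) semantic criterion: greedy grouping by topic similarity,
        # comparing each mention to the first member of each cluster.
        for mention, topics in by_date[date]:
            for cluster in clusters:
                if cosine(topics, cluster[0][1]) >= threshold:
                    cluster.append((mention, topics))
                    break
            else:
                clusters.append([(mention, topics)])
        for cluster in clusters:
            timeline.append((date, [m for m, _ in cluster]))
    return timeline


# Toy input: four mentions from three documents (invented for illustration).
events = [
    ("doc1: acquired", "2010-03-01", [0.9, 0.1, 0.0]),
    ("doc2: bought", "2010-03-01", [0.8, 0.2, 0.0]),
    ("doc2: resigned", "2010-03-01", [0.0, 0.1, 0.9]),
    ("doc3: announced", "2009-11-15", [0.5, 0.5, 0.0]),
]
timeline = group_events(events)
```

In this toy input, "acquired" and "bought" share a date and have close topic vectors, so they fall into one cluster, while "resigned" shares the date but not the topic and stays separate.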


Similar resources

Cross-Document Event Ordering through Temporal Relation Inference and Distributional Semantic Models / Ordenación de eventos multidocumento usando inferencia de relaciones temporales y modelos semánticos distribucionales

This paper focuses on the contribution of temporal relations inference and distributional semantic models to the event ordering task. Our system automatically builds ordered timelines of events from different written texts in English by performing first temporal clustering and then semantic clustering. In order to determine temporal compatibility, an inference from the temporal relationships be...


A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...


Content Modeling Using Latent Permutations

We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that...



Multi-document Summarization Based on Atomic Semantic Events and Their Temporal Relationships

Automatic multi-document summarization (MDS) is the process of extracting the most important information such as events and entities from multiple natural language texts focused on the same topic. We extract all types of semantic atomic information and feed them to a topic model to experiment with their effects on a summary. We design a coherent summarization system by taking into account the s...




Publication date: 2015